AWR Top 10 Foreground Events

In an ideal database, CPU and I/O should be the top wait events.

If you see events such as enq: TX - row lock contention or latch free near the top, that means there is contention in your database.

In most OLTP workloads, db file sequential read (single-block reads, typically index access) should be higher than db file scattered read (multiblock reads, typically full table scans).

The following are 10 of the most common causes for wait events, along with explanations and potential solutions:


Cell Smart Table Scan. The cell smart table scan event is what Oracle uses to account for time spent waiting for full table scans that are offloaded. It is the most important new event on the Exadata platform. Its presence or absence can be used to verify whether a statement benefited from Offloading or not.

Offloading only occurs when Oracle is able to do direct path reads. Consequently, this event replaces the direct path read event in most cases on Exadata. As with normal direct path reads, data is returned directly to the PGA of the requesting process on the database server. Blocks are not returned to the buffer cache.

Exadata Smart Scan is a feature of the Database Machine. It offloads the data search and retrieval processing to the storage cells. Exadata cells evaluate query predicates at the storage level to optimize the performance of certain classes of bulk data processing. For example, the performance of queries that require full table or index scans to evaluate selective predicates can be improved by pushing the database expression evaluations to the storage cells.
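As a quick check (a sketch, assuming an Exadata platform and access to V$SQL on 11.2 or later; the sql_id below is a placeholder), the offload-related columns of V$SQL show whether a statement was eligible for Smart Scan and how much data the cells filtered before returning results over the interconnect:

```sql
-- Hypothetical sql_id; replace it with the statement you are investigating.
-- IO_CELL_OFFLOAD_ELIGIBLE_BYTES > 0 means the statement was eligible for
-- offload; a much smaller IO_INTERCONNECT_BYTES shows the cells filtered
-- most of the data before shipping it to the database server.
select sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes
from   v$sql
where  sql_id = '0abcd1efgh2ij';
```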


0. Cell Single Block Physical Read. This event is equivalent to the db file sequential read event used on non-Exadata platforms. Single block reads are used most often for index access paths (both the index block reads and the table block reads via rowids from the index lookups). They can also be used for a wide variety of other operations where it makes sense to read a single block.

If these are the only I/O wait events you see (and they do not appear together with multiblock reads), then you are probably not doing full segment scans at all. Of course, sometimes single block reads show up due to other operations in the execution plan (such as an index range scan) or due to chained rows in data blocks.

Exadata provides a large amount of flash cache (384 GB) on each storage cell. For that reason, physical reads (both multiblock and single-block) are considerably faster than on most disk-based storage systems.


1. DB File Scattered Read. This generally indicates waits related to full table scans. As full table scans are pulled into memory, they rarely fall into contiguous buffers but instead are scattered throughout the buffer cache. A large number here indicates that your table may have missing or suppressed indexes.


2. DB File Sequential Read. This event generally indicates a single block read (an index read, for example). A large number of waits here could indicate poor joining orders of tables, or unselective indexing. It is normal for this number to be large for a high-transaction, well-tuned system, but it can indicate problems in some circumstances. You should correlate this wait statistic with other known issues within the Statspack report, such as inefficient SQL.
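To see which segment a session's single block reads are actually hitting, one approach (a sketch; it only catches sessions currently waiting on the event, and DBA_EXTENTS lookups can be slow on databases with many extents) is to decode the wait parameters, where P1 is the file number and P2 is the block number:

```sql
-- Map sessions currently waiting on db file sequential read to the
-- segment being read: P1 = file#, P2 = block#.
select s.sid, e.owner, e.segment_type, e.segment_name
from   v$session s, dba_extents e
where  s.event = 'db file sequential read'
and    e.file_id = s.p1
and    s.p2 between e.block_id and e.block_id + e.blocks - 1;
```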

These circumstances are usually interrelated, and they often occur in conjunction with the appearance of db file scattered read and db file sequential read in the Top 5 Wait Events section.



3. Free Buffer. This indicates your system is waiting for a buffer in memory, because none is currently available.

Waits in this category may indicate that you need to increase the buffer cache (DB_CACHE_SIZE), if all your SQL is tuned.

Free buffer waits could also indicate that unselective SQL is causing data to flood the buffer cache with index blocks, leaving none for this particular statement that is waiting for the system to process. This normally indicates that there is a substantial amount of DML (insert/update/delete) being done and that the Database Writer (DBWR) is not writing quickly enough; the buffer cache could be full of multiple versions of the same buffer, causing great inefficiency.

To address this, you may want to consider accelerating incremental checkpointing, using more DBWR processes, or increasing the number of physical disks.
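For the second option, a sketch of adding DBWR processes (the value 4 is illustrative; size it to your CPU count and write load):

```sql
-- DB_WRITER_PROCESSES is a static parameter: the change is written to
-- the spfile and takes effect only after an instance restart.
alter system set db_writer_processes = 4 scope = spfile;
```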

To investigate if this is an I/O problem, look at the File I/O Statistics.
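A per-datafile view of the same statistics can be pulled directly from the dictionary; this is a sketch (timings in V$FILESTAT are in centiseconds, hence the *10 conversion to milliseconds):

```sql
-- Datafiles ordered by physical reads; a high average single-block
-- read time points to an I/O problem rather than a cache sizing problem.
select d.name,
       f.phyrds,
       f.singleblkrds,
       round(f.singleblkrdtim / nullif(f.singleblkrds, 0) * 10, 2) avg_ms
from   v$filestat f, v$datafile d
where  f.file# = d.file#
order  by f.phyrds desc;
```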


4. Buffer Busy. This is a wait for a buffer that is being used in an unshareable way or is being read into the buffer cache.

Buffer busy waits should not be greater than 1%.

Check the Buffer Wait Statistics section (or V$WAITSTAT) to find out the type of wait:

  • Data block / segment header ==> Move data to another block to avoid the hot block, increase the freelists on the table, or use Locally Managed Tablespaces (LMTs).
  • Index block ==> Rebuild the index, partition the index, or use a reverse key index.
  • To prevent buffer busy waits related to data blocks, you can also use a smaller block size: fewer records fall within a single block in this case, so the block is not as "hot."
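The block class breakdown behind these choices can be checked with a simple query:

```sql
-- Buffer busy waits broken down by block class; the class with the
-- highest wait time tells you which remedy above applies.
select class, count, time
from   v$waitstat
order  by time desc;
```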

    When DML occurs, Oracle Database writes information into the block, including an entry for every transaction "interested" in the state of the block (the Interested Transaction List, ITL).

    To decrease waits in this area:

    1. Increase the INITRANS, which will create the space in the block to allow multiple ITL slots.

    2. Increase the PCTFREE on the table where this block exists (the extra free space allows additional ITL slots, up to the number specified by MAXTRANS, to be created when the slots built by INITRANS are exhausted).
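A sketch of the first remedy (table name is illustrative; note that raising INITRANS affects only newly formatted blocks, so existing blocks need a rebuild to pick it up):

```sql
-- Allow up to 10 ITL slots per block for new blocks.
alter table scott.orders initrans 10;

-- Rebuild so existing blocks are reformatted with the new setting.
-- Indexes become UNUSABLE after MOVE and must be rebuilt afterwards.
alter table scott.orders move;
```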

    Brief Definition:

    Buffer busy wait happens when a session wants to access a database block in the buffer cache but cannot, because the buffer is "busy". It indicates that multiple processes are attempting to access the same buffer concurrently: either the buffer is still being read from disk, or the session is waiting for another session's block change to complete.

    The two main cases where this can occur are the following:

    1. Another session is reading the block into the buffer
    2. Another session holds the buffer in an incompatible mode to our request

    Problem Confirmation:

    The top timed events section of the AWR/Statspack report shows a significant percentage of database time spent on this wait event.

    Reducing Waits:

    1. Identify the segments where buffer busy wait contention is occurring.
    2. Identify the block type from V$WAITSTAT or from the Statspack/AWR report (section: Buffer Wait Statistics).
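For step 1, the segment-level statistics make this straightforward; a sketch:

```sql
-- Top 10 segments by buffer busy waits since instance startup.
select * from (
  select owner, object_name, object_type, value
  from   v$segment_statistics
  where  statistic_name = 'buffer busy waits'
  order  by value desc
) where rownum <= 10;
```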

    Freelist blocks/segment header:

    Concurrent INSERTs with a suboptimal freelist configuration can lead to buffer busy wait contention as multiple sessions attempt to insert data into the same block (because it appears on the freelist to them). Heavy INSERT activity by concurrent sessions can cause multiple sessions to attempt their insert into the same blocks because automatic segment space management (ASSM) is NOT used, AND there is only a single freelist, too few process freelists, and/or no freelist groups.

    The best solution is to use ASSM since it is sometimes tricky to arrive at a correct freelist or freelist group setting. Adding process freelists will help remove contention as each process will map to separate blocks. Freelists can be added at any time without rebuilding the table. Adding freelist groups will also remove contention by mapping processes to other freelists. This is of greatest benefit in RAC environments where the freelist group block itself will be associated with an instance, but will still help in single instance environments as well. The table must be rebuilt to change the freelist group setting.
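The two remedies above can be sketched as follows (tablespace, datafile path, and table names are illustrative):

```sql
-- Preferred: an ASSM tablespace, which makes FREELISTS / FREELIST GROUPS
-- settings unnecessary (they are ignored for segments created here).
create tablespace users_assm
  datafile '/u01/oradata/users_assm01.dbf' size 1g
  extent management local
  segment space management auto;

-- Alternative for manually managed segments: process freelists can be
-- added at any time without rebuilding the table.
alter table scott.orders storage (freelists 4);
```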

    Data blocks:

    Concurrent INSERTs or UPDATEs may see contention when a related index has a key that is constantly increasing (e.g., a key based on a sequence number), because the steadily increasing key values are concentrated in a leaf block on the "right-hand side" of the index. Look at using reverse key indexes (if range scans aren't commonly used against the segment). A reverse key index will spread keys around evenly and avoid creating these hot leaf blocks. However, a reverse key index cannot be used for index range scans, so care must be taken to ensure that access is normally done via equality predicates.

    Also eliminate hot block access from the application. Many concurrent physical reads against the same blocks will result in buffer busy waits: one session gets to do the actual physical read, and the others are blocked on the buffer busy wait event until the read completes. This is usually an indication that the SQL statement must be tuned. Oracle's SQL Tuning Advisor can help tune specific SQL statements quickly and easily if you are licensed to use the Enterprise Manager Tuning Pack.
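A reverse key index is a one-line change at creation time (index and table names are illustrative):

```sql
-- Scatters sequence-based keys evenly across leaf blocks; remember
-- this disables index range scans on the key, so access should be
-- via equality predicates.
create index scott.orders_id_rev
  on scott.orders (order_id)
  reverse;
```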

    Undo header:

    Use automatic undo management or add more rollback segments. The waits can be amplified greatly when physical reads are slow due to poor I/O subsystem performance.
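A sketch of switching to automatic undo management (the undo tablespace name is illustrative; both changes are picked up at the next restart):

```sql
-- UNDO_MANAGEMENT is a static parameter: restart required.
alter system set undo_management = auto scope = spfile;
alter system set undo_tablespace = undotbs1 scope = spfile;
```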

    Measuring Success:

    Once you have applied the changes to resolve the issues you have found, compare the latest AWR to the AWR that led you here via Guided Resolution (that AWR becomes your baseline). Look at the percentage decrease in total wait time for this event. If there are still issues, re-evaluate them and address them according to the specific symptom.

    NOTE

    Automatic segment space management (ASSM) is a simpler and more efficient way of managing space within a segment. It eliminates any need to specify and tune the pctused, freelists, and freelist groups storage parameters for schema objects created in the tablespace.

    If any of these attributes are specified, they are ignored.



    Block change tracking buffer space

    When a session is waiting on the block change tracking buffer space event, it is waiting for space in the CTWR dba buffer.

    Several causes for this wait event are:

    1. Poor I/O performance on the disk where the change-tracking file resides.
    2. CTWR dba buffer is too small to record the number of concurrent block changes.

    Workings of CTWR:

    1. When data blocks change, shadow processes track the changed blocks in a private area of memory at the same time they generate redo.
    2. When a commit is issued, the BCT information is copied to a shared area in Large Pool called 'CTWR dba buffer'.
    3. At the checkpoint, CTWR writes the information from the buffer to the change-tracking file.
    4. If contention for space in the CTWR dba buffer occurs, a wait event called, 'Block Change Tracking Buffer Space' is recorded.
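The current size of that buffer can be checked in the SGA statistics; a sketch (on most versions the entry appears in the large pool under a name beginning with 'CTWR'):

```sql
-- Size of the change tracking writer's buffer in the large pool.
select pool, name, bytes
from   v$sgastat
where  name like 'CTWR%';
```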

    Library Cache Load Lock

    The library cache load lock wait event occurs when a process is waiting for the opportunity to load an object or a piece of an object into the library cache. Loads of SQL statements are serialized and locked in exclusive mode, so that only one process can load an object or a piece of an object at a time. In general, all library cache waits are associated with non-reusable SQL or an undersized shared pool.

    In general, the library cache load lock occurs during periods of high activity within the library cache, especially in databases that do not use bind variables and therefore do heavy hard parsing of non-reusable SQL.
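One way to confirm the non-reusable-SQL cause (a sketch; the threshold of 20 is arbitrary) is to group V$SQL by FORCE_MATCHING_SIGNATURE, which is shared by statements that differ only in their literals:

```sql
-- Large groups here are statements that should be using bind variables.
select force_matching_signature, count(*) as copies
from   v$sql
where  force_matching_signature <> 0
group  by force_matching_signature
having count(*) > 20
order  by copies desc;
```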


    5. Latch Free. Latches are low-level queuing mechanisms (more accurately referred to as mutual exclusion mechanisms) used to protect shared memory structures in the system global area (SGA).

    Latches are like locks on memory that are very quickly obtained and released. They are used to prevent concurrent access to a shared memory structure. If the latch is not available, a latch free miss is recorded.

    Most latch problems are related to the failure to use bind variables (library cache latch), redo generation issues (redo allocation latch), buffer cache contention (cache buffers LRU chain latch), and hot blocks in the buffer cache (cache buffers chains latch).

    There are also latch waits related to bugs; check My Oracle Support (MetaLink) for bug reports if you suspect this is the case.

    When latch miss ratios are greater than 0.5%, you should investigate the issue.

    If latch free waits are in the Top 5 Wait Events or high in the complete Wait Events list, look at the latch-specific sections of the AWR report to see which latches are contended for.
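A quick way to see which latches to drill into (a sketch; sleeps, i.e. waits after a miss, are usually the most telling column):

```sql
-- Latches ordered by sleeps; the top entries are the ones worth
-- examining in the latch-specific sections of the AWR report.
select name, gets, misses, sleeps
from   v$latch
order  by sleeps desc;
```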


    6. Enqueue. An enqueue is a lock that protects a shared resource.

    Locks protect shared resources, such as data in a record, to prevent two people from updating the same data at the same time.

    An enqueue includes a queuing mechanism, which is FIFO (first in, first out).

    Note: Oracle's latching mechanism is not FIFO. Enqueue waits usually point to the ST enqueue (space management), the HW enqueue (segment high-water mark), the TX enqueue (transaction/row lock), and the TM enqueue (DML/table lock).

    How To Analyze the Wait Statistic: 'enq: HW - contention'

    Given the following situation (this information can be retrieved from a Statspack, ADDM, or ASH report, or a v$session_wait query):

    Event                 % Wait Time  P1 Parameter  P1 Value    P2 Parameter   P2 Value  P3 Parameter  P3 Value
    enq: HW - contention  62.81        name|mode     1213661190  table space #  4         block         17005691

    1. Determine the correct file and block number:

    select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(17005691) FILE#,
    DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(17005691) BLOCK#
    from dual;

    FILE# BLOCK#
    ---------- ----------
    4 228475

    For bigfile tablespaces, do not use DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE, or you will get wrong results.
    Reference: https://docs.oracle.com/cd/E18283_01/appdev.112/e16760/d_util.htm#i1002531
    In such a case, just use the tablespace# and assume p3 is the block number (there is no relative file number).
    2. Determine the object to which this block belongs:


    select owner, segment_type, segment_name
    from dba_extents
    where file_id = 4
    and 228475 between block_id and block_id + blocks - 1;

    OWNER SEGMENT_TYPE SEGMENT_NAME
    --------------- --------------- ------------------------------
    SCOTT LOBSEGMENT EMP_DATA_LOB

    Additionally, if the lock contention is currently observed, we can find out the underlying segment using the following query:


    select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(ID2) FILE#,
    DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(ID2) BLOCK#
    from v$lock
    where type = 'HW';

    As the 'enq HW - contention' may be caused by a number of different reasons, there are also several possible different solutions to alleviate or reduce contention.


    Things to check are:

    1. Ensure that your lob segment is not frequently extending.
    2. Check I/O performance.
    3. A table containing a very busy lob segment may need partitioning in a manner that will evenly distribute concurrent DML across multiple partitions.
    4. Frequent lob space/chunk reclamation can also cause 'enq HW - contention'.

    In the case of point 4, there are a couple of options that may be employed to provide either temporary relief or a workaround for the problem.
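One illustrative option (a syntax sketch only; the table name is hypothetical, the LOB column matches the segment found above, and the target tablespace must be ASSM) is to convert the BasicFile LOB to a SecureFile LOB, which manages free space differently and often reduces HW enqueue contention:

```sql
-- MOVE LOB rebuilds the LOB segment, so it needs an outage window
-- (or use DBMS_REDEFINITION for an online conversion).
alter table scott.emp_data
  move lob (emp_data_lob)
  store as securefile (tablespace users_assm);
```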